Analysing the Effects of Reward Shaping in Multi-Objective Stochastic Games

Authors

  • Patrick Mannion
  • Jim Duggan
  • Enda Howley
Abstract

The majority of Multi-Agent Reinforcement Learning (MARL) implementations aim to optimise systems with respect to a single objective, despite the fact that many real-world problems are inherently multi-objective in nature. Research into multi-objective MARL is still in its infancy, and few studies to date have dealt with the issue of credit assignment. Reward shaping has been proposed as a means to address the credit assignment problem in single-objective MARL; however, it has been shown to alter the intended goals of the domain if misused, leading to unintended behaviour. Two popular shaping methods are Potential-Based Reward Shaping and difference rewards, and both have been repeatedly shown to improve learning speed and the quality of joint policies learned by agents in single-objective problems. In this work we discuss the theoretical implications of applying these approaches to multi-objective problems, and evaluate their efficacy using a new multi-objective benchmark domain where the true Pareto optimal system utilities are known. Our work provides the first empirical evidence that agents using these shaping methodologies can sample true Pareto optimal solutions in multi-objective Stochastic Games.
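
As a rough illustration of the two shaping methods named in the abstract, the sketch below uses their standard single-objective forms and simply applies them to each objective of a reward vector in turn. That per-objective treatment is an assumption made for illustration, not necessarily the formulation used in the paper. The function names, the `toy_utility` function, and the choice of a default action for the counterfactual are all hypothetical.

```python
from typing import Callable, List, Sequence


def pbrs_reward(reward: Sequence[float],
                phi_s: Sequence[float],
                phi_next: Sequence[float],
                gamma: float) -> List[float]:
    """Potential-based shaping, r + gamma * Phi(s') - Phi(s), per objective.

    Assumes one potential value per objective (an illustrative choice).
    """
    return [r + gamma * pn - p for r, p, pn in zip(reward, phi_s, phi_next)]


def difference_reward(global_utility: Callable[[Sequence[int]], List[float]],
                      joint_action: Sequence[int],
                      agent: int,
                      default_action: int) -> List[float]:
    """Difference reward D_i = G(z) - G(z_-i), per objective.

    The counterfactual z_-i is built here by replacing the agent's action
    with a default action; this is one common choice, assumed here.
    """
    counterfactual = list(joint_action)
    counterfactual[agent] = default_action
    actual = global_utility(joint_action)
    without_agent = global_utility(counterfactual)
    return [a - b for a, b in zip(actual, without_agent)]


def toy_utility(actions: Sequence[int]) -> List[float]:
    """Hypothetical two-objective system utility for demonstration only.

    Objective 0 rewards up to two active agents; objective 1 charges a
    cost of 1 per active agent.
    """
    active = sum(actions)
    return [float(min(active, 2)), -float(active)]


if __name__ == "__main__":
    print(pbrs_reward([1.0, -2.0], [0.5, 0.0], [2.0, 1.0], gamma=0.5))
    print(difference_reward(toy_utility, [1, 1, 1], agent=0, default_action=0))
```

In the toy example, the third active agent adds nothing to the first objective but still incurs its cost on the second, which is the kind of per-agent contribution signal that difference rewards are intended to expose.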

Similar resources

A Theoretical and Empirical Analysis of Reward Transformations in Multi-Objective Stochastic Games

Reward shaping has been proposed as a means to address the credit assignment problem in Multi-Agent Systems (MAS). Two popular shaping methods are Potential-Based Reward Shaping and difference rewards, and both have been shown to improve learning speed and the quality of joint policies learned by agents in single-objective MAS. In this work we discuss the theoretical implications of applying th...

Policy Invariance under Reward Transformations for General-Sum Stochastic Games

We extend the potential-based shaping method from Markov decision processes to multi-player general-sum stochastic games. We prove that the Nash equilibria in a stochastic game remain unchanged after potential-based shaping is applied to the environment. The property of policy invariance provides a possible way of speeding up convergence when learning to play a stochastic game.

Decidability Results for Multi-objective Stochastic Games

We study stochastic two-player turn-based games in which the objective of one player is to ensure several infinite-horizon total reward objectives, while the other player attempts to spoil at least one of the objectives. The games have previously been shown not to be determined, and an approximation algorithm for computing a Pareto curve has been given. The major drawback of the existing algori...

PRISM-Games 2.0: A Tool for Multi-objective Strategy Synthesis for Stochastic Games

We present a new release of PRISM-games, a tool for verification and strategy synthesis for stochastic games. PRISM-games 2.0 significantly extends its functionality by supporting, for the first time: (i) long-run average (mean-payoff) and ratio reward objectives, e.g., to express energy consumption per time unit; (ii) strategy synthesis and Pareto set computation for multi-objective properties...

Multi-agent, reward shaping for RoboCup KeepAway

This paper investigates the impact of reward shaping in multi-agent reinforcement learning as a way to incorporate domain knowledge about good strategies. In theory [2], potential-based reward shaping does not alter the Nash Equilibria of a stochastic game, only the exploration of the shaped agent. We demonstrate empirically the performance of state-based and state-action-based reward shaping in...

Publication date: 2017